Real-time analytics on large dynamic graphs
Author: Jayanta Mondal
Abstract
Title of dissertation: REAL-TIME ANALYTICS ON LARGE DYNAMIC GRAPHS
Jayanta Mondal, Doctor of Philosophy, 2015
Dissertation directed by: Professor Amol Deshpande, Department of Computer Science

In today's fast-paced and interconnected digital world, the data generated by an increasing number of applications is being modeled as dynamic graphs. The graph structure encodes relationships among data items, while the structural changes to the graphs, together with the continuous stream of information produced by the entities in them, make these graphs dynamic. Examples include social networks where users post status updates, images, and videos; phone call networks where nodes send text messages or place phone calls; and road traffic networks where the traffic behavior of the road segments changes constantly. There is tremendous value in storing, managing, and analyzing such dynamic graphs and deriving meaningful insights in real time. However, a majority of the work in graph analytics assumes a static setting, and there is a lack of systematic study of the various dynamic scenarios, the complexity they impose on the analysis tasks, and the challenges in building efficient systems that can support such tasks at a large scale.

In this dissertation, I design a unified streaming graph data management framework and develop prototype systems to support increasingly complex tasks on dynamic graphs. In the first part, I focus on the management and querying of distributed graph data. I develop a hybrid replication policy that monitors the read-write frequencies of the nodes to decide dynamically what data to replicate, and whether to replicate eagerly or lazily, in order to minimize network communication and support low-latency querying.

In the second part, I study parallel execution of continuous neighborhood-driven aggregates, where each node aggregates the information generated in its neighborhood. I build my system around the notion of an aggregation overlay graph, a pre-compiled data structure that enables sharing of partial aggregates across different queries and also allows partial precomputation of the aggregates to minimize query latencies and increase throughput.

Finally, I extend the framework to support continuous detection and analysis of activity-based subgraphs, where subgraphs can be specified using both graph structure and activity conditions on the nodes. The query specification tasks in my system are expressed using a set of active structural primitives, which allows the query evaluator to use a set of novel optimization techniques, thereby achieving high throughput.

Overall, in this dissertation, I define and investigate a set of novel tasks on dynamic graphs, design scalable optimization techniques, build prototype systems, and show the effectiveness of the proposed techniques through extensive evaluation using large-scale real and synthetic datasets.
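The abstract states that the hybrid replication policy uses per-node read-write frequencies to choose between eager and lazy replication, but it does not give the decision rule. The following is a minimal sketch under an assumed cost model, not the dissertation's actual policy: an eager replica is assumed to cost one network message per write, and a lazy replica one message per remote read. The names `NodeStats`, `choose_replication`, and `min_reads` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    """Hypothetical per-node counters collected over a recent time window."""
    reads: int = 0   # remote queries that needed this node's data
    writes: int = 0  # updates produced by this node


def choose_replication(stats: NodeStats, min_reads: int = 1) -> str:
    """Return 'none', 'eager', or 'lazy' for one node.

    Assumed cost model (not taken from the dissertation):
    eager replication sends one message per write,
    lazy replication sends one message per remote read.
    """
    if stats.reads < min_reads:
        return "none"    # too few remote reads to justify a replica
    if stats.reads >= stats.writes:
        return "eager"   # push every write so that reads are served locally
    return "lazy"        # fetch on demand so that frequent writes stay cheap
```

Under these assumptions, a read-heavy node such as `NodeStats(reads=10, writes=2)` is replicated eagerly, while a write-heavy node such as `NodeStats(reads=2, writes=10)` is replicated lazily.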
Similar resources
GraphIn: An Online High Performance Incremental Graph Processing Framework
The massive explosion in social networks has led to a significant growth in graph analytics and specifically in dynamic, time-varying graphs. Most prior work processes dynamic graphs by first storing the updates and then repeatedly running static graph analytics on saved snapshots. To handle the extreme scale and fast evolution of real-world graphs, we propose a dynamic graph analytics framewor...
EvoGraph: On-the-Fly Efficient Mining of Evolving Graphs on GPU
With the prevalence of the World Wide Web and social networks, there has been a growing interest in high performance analytics for constantly-evolving dynamic graphs. Modern GPUs provide a massive amount of parallelism for efficient graph processing, but challenges remain due to their lack of support for the near real-time streaming nature of dynamic graphs. Specifically, due to the current h...
Accelerating Dynamic Graph Analytics on GPUs
As graph analytics often involves compute-intensive operations, GPUs have been extensively used to accelerate the processing. However, in many applications such as social networks, cyber security, and fraud detection, their representative graphs evolve frequently and one has to perform a rebuild of the graph structure on GPUs to incorporate the updates. Hence, rebuilding the graphs becomes the ...
Storage and Processing Systems for Power-Law Graphs
Large graphs abound around us – online social networks, Web graphs, the Internet, citation networks, protein interaction networks, telephone call graphs, peer-to-peer overlay networks, electric power grid networks, etc. Many real-life graphs are power-law graphs. A fundamental challenge in today’s Big Data world is storage and processing of these large-scale power-law graphs. In this thesis, we ...
Dynamic Visual Analytics—Facing the Real-Time Challenge
Modern communication infrastructures enable more and more information to be available in real-time. While this has proven to be useful for very targeted pieces of information, the human capability to process larger quantities of mostly textual information is definitely limited. Dynamic visual analytics has the potential to circumvent this real-time information overload by combining incremental ...
Publication date: 2015